Abstract

Gene-derived simple sequence repeats (genic SSRs), also known as functional markers, are often
preferred over random genomic markers because they represent variation in gene coding and/or
regulatory regions. We characterized 544 genic SSR loci derived from 138 candidate genes involved
in wood formation, distributed throughout the genome of Populus tomentosa, a key ecological and
cultivated wood production species. Of these SSRs, three-quarters were located in the promoter or
intron regions, and dinucleotide (59.7%) and trinucleotide repeat motifs (26.5%) predominated. By
screening 15 wild P. tomentosa ecotypes, we identified 188 polymorphic genic SSRs with 861 alleles,
with 2-7 alleles per marker. Transferability analysis of 30 random genic SSRs, testing whether these
SSRs work in 26 genotypes from five sections of the genus Populus (outgroup, Salix matsudana), showed
that 72% of the SSRs could be amplified in Turanga and 100% could be amplified in Leuce. Based on
genotyping of these 26 genotypes, a neighbour-joining analysis showed the expected six phylogenetic
groupings. In silico analysis of SSR variation in 220 sequences that are homologous between
P. tomentosa and Populus trichocarpa suggested that genic SSR variations between relatives were
predominantly affected by repeat motif variations or flanking sequence mutations. Inheritance tests
and single-marker associations demonstrated the power of genic SSRs in family-based linkage mapping
and candidate gene-based association studies, as well as marker-assisted selection and comparative
genomic studies of P. tomentosa and related species.
Keywords: candidate gene-derived SSRs; cross-species transferability; in silico analysis of SSR
variations; Populus tomentosa; single marker-trait association mapping



1. Introduction

Poplars (Populus spp.) are widely distributed all over the world and are important commercial tree
species for timber production. In addition to their important economic value, poplars also play key
pioneer roles in environmental protection and in the stability and sustainability of forest
ecosystems. 1,2 Many members of the genus Populus have been physiologically and genetically
characterized based on their desirable biological characteristics, such as rapid growth, easy
transformation, modest genome size, and the ability to make interspecific crosses and propagate
vegetatively. 1,2 Thus, poplars have become a model species for



© The Author 2012. Published by Oxford University Press on behalf of Kazusa DNA Research Institute. 

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License
(http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and
reproduction in any medium, provided the original work is properly cited.






Genic SSRs in Populus



[Vol. 20, 



studies of angiosperm trees, particularly because the whole genome of Populus trichocarpa has been
sequenced and annotated (http://genome.jgi-psf.org/Poptr1/Poptr1.home.html). 1 Additional genomics
resources include databases of expressed sequence tags (ESTs) (http://www.populus.db.umu.se/indexhtml)
and simple sequence repeats (SSRs) (http://www.ornl.gov/sci/ipgc/ssr_resource.htm). These resources
not only provide data for comparison of a long-lived perennial to short-lived model plants (e.g.
Arabidopsis and rice), but also offer new opportunities to explore the genetic basis of wood
formation, perenniality, and dormancy. 2,3

The Chinese white poplar (Populus tomentosa) belongs to the section Leuce within the genus Populus
and is native to northern China, with a distribution zone of 1 million km². P. tomentosa is of major
commercial importance in timber and pulp production and also plays an important role in ecological
and environmental protection. 4 A vast amount of genetic variation has arisen during the evolution
of the species, as is evident in the natural populations. 5,6 This accumulated genetic variation
provides an important resource for the exploration of the molecular mechanisms of wood formation and
is also a source of alleles for the potential improvement of wood products. However, conventional
breeding programmes may not be sufficient to improve this long-lived species. 6 Modern molecular
breeding tools, such as molecular marker-assisted selection (MAS) breeding, could enhance important
Populus agronomic traits, including growth rate, wood quality, and disease resistance. Hence,
development of suitable genetic marker resources is an important foundation for MAS breeding.

Among molecular genetic markers, single nucleotide polymorphisms are often used for genetic
analysis. However, DNA microsatellites, or SSR markers, are excellent genetic markers because they
are hypervariable, co-dominant, and therefore highly informative. 7,8 Moreover, compared with SSR
markers derived from random genomic locations, SSR markers derived from genes will likely provide a
much greater degree of resolution in association mapping because they occur within the gene and
thus may affect gene expression or function. 8 In addition, genic SSRs exhibit relatively high
transferability to closely related species and can be used as anchor markers for comparative
mapping and evolutionary studies. 6-9 Gene-based microsatellites have now been developed for a
limited number of Populus species based on the P. trichocarpa genome sequence. 10,11 However, very
limited genomic information is available for P. tomentosa, 4,12 and as the P. tomentosa linkage map
was constructed using amplified fragment length polymorphisms (AFLPs), 13 functional genomics
studies of this economically important species are in their infancy. Furthermore, another important
approach, the use of SSRs from fully characterized genes or full-length cDNA clones, has not yet
been utilized in Populus.

Wood quality traits are considered to be quantitative, controlled by multiple genes, with
moderate-to-high heritability. 14 Currently, large numbers of candidate genes involved in wood
formation have been isolated from P. tomentosa using direct sequencing methods, although many have
not been published. 4,6,15,16 Therefore, to improve the properties of wood using a MAS approach,
identification and characterization of species-specific SSR loci from wood formation-related genes
is a highly promising approach. Here, we make use of the large dataset of available gene sequences
to identify a large number of gene-based SSR markers for P. tomentosa. The specific aims of our
study were to: (i) characterize the genic SSR loci in P. tomentosa and evaluate SSR primers and
polymorphisms in different wild-type varieties, (ii) test cross-species transferability within the
genus Populus and conduct in silico analysis of SSR variation between P. tomentosa and
P. trichocarpa, and (iii) examine inheritance segregation in a mapping population and analyze
single SSR marker-trait association in a natural population. This study provides a valuable SSR
resource for comparative genomic studies of the genus Populus, and the genic microsatellites also
serve as 'framework' markers for construction of a physical map for alignment of the ongoing
sequencing of the complete P. tomentosa genome.

2. Materials and methods 

2.1. Microsatellite identification, primer design, and SSR polymorphism screening

Total genomic DNA was extracted from young leaves using the DNeasy Plant Mini kit (Qiagen China,
Shanghai), following the manufacturer's protocol. The reference gene models of 150 candidate genes
involved in wood formation were obtained by BLASTX analyses against the NCBI database
(http://www.ncbi.nlm.nih.gov/) or from the JGI database
(http://genome.jgi-psf.org/Poptr1_1/Poptr1_1.home.html) (Supplementary Table S1). Subsequently, a
set of specific primers was designed for polymerase chain reaction (PCR) amplification of all 150
genes, and total genomic DNA (20 ng per reaction) from the P. tomentosa LM50 clone was used for
amplification. All the PCR amplification products from LM50 were sequenced (both strands) using
conserved primers, the BigDye Terminator Cycle Sequencing kit, version 3.1 (Applied Biosystems,
Beijing, China), and a Li-Cor 4300 genetic analyzer (Li-Cor Biosciences, Lincoln, Nebraska, USA).
In all, 696 792 bp of genomic DNA sequence from these 150 unique candidate genes, with an average
of 4645 bp per gene, were obtained by direct sequencing, and the sequence data of the 150 candidate
genes generated in this study were deposited in the GenBank Data Library under the accession
numbers JX986583-JX986732 (Supplementary Table S1). These sequences were mined for genic SSR
markers using the SSRIT software (http://www.gramene.org/db/searches/ssrtool). 17 All microsatellite
loci were located exclusively in the coding and/or regulatory regions of candidate genes, i.e.
exons, promoters, 5' untranslated regions (UTRs), 3' UTRs, or introns. SSRs were retained only if
they contained at least five repeat units for dinucleotide motifs, four for trinucleotide motifs,
and three for tetranucleotide or longer (pentanucleotide or more) motifs. A set of compound perfect
repeat units was also identified. Mononucleotide repeats and complex sequence repeats were excluded.
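
The repeat-count thresholds described above can be illustrated with a short regular-expression scan. This is a minimal sketch in Python and is not the SSRIT implementation: the function name find_perfect_ssrs and the MIN_REPEATS table are hypothetical, only perfect repeats with 2-6 bp motifs are reported, and adjacent repeats are not merged into compound SSRs.

```python
import re

# Minimum number of repeat units per motif length, following the thresholds
# stated in the text: 5 (di-), 4 (tri-), 3 (tetra- or longer).
MIN_REPEATS = {2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_perfect_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples; start is 0-based."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # ([ACGT]{n}) captures one motif; \1{m,} demands at least m more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are copies of a shorter unit, e.g. "AA" or "ATAT".
            # This excludes mononucleotide runs and avoids double-counting.
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical test sequence: (AT)6 and (AGG)4 separated by short spacers.
    example = "GGC" + "AT" * 6 + "CCT" + "AGG" * 4 + "TT"
    for start, motif, count in find_perfect_ssrs(example):
        print(start, motif, count)
```

Because the scan reports the leftmost frame of each repeat, the motif returned for a given locus may be a rotation of the motif named in a marker table (for example GAG rather than AGG), depending on the flanking bases.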

Primer pairs specific for the SSR flanking sequences were designed using the Primer 3 program
(http://frodo.wi.mit.edu/primer3/primer3.FAQ.html), according to the parameters reported by Du et
al. 6 All SSR primers were initially screened using genomic DNA from the P. tomentosa LM50 clone
(three replications). Amplification was carried out using standard PCR conditions with annealing
temperatures (Ta) set according to the primer sequences. PCR products were separated by
electrophoresis in 2% agarose gels. Electrophoresis products were visualized and photographed using
the Fluorchem™ 5500 (Alpha Innotech Corp., USA) gel documentation system. Finally, a subset of
optimal SSR primers was identified and designated as 'validated genic SSR markers'.

All validated genic SSR markers were scored for amplicon size polymorphisms among 15 wild
P. tomentosa ecotypes that represented nearly the entire P. tomentosa geographic distribution
(Supplementary Table S2). 4,12 Observed product sizes and numbers of alleles per locus (Na) were
calculated for each marker using POPGENE, version 1.32. 18

2.2. PCR amplification and SSR genotyping 

The SSR amplification reaction system and PCR amplifications were conducted following the procedure
of Du et al. 6 PCR product amplification was confirmed on a 2% agarose gel, and products were then
separated by capillary electrophoresis using an ABI3730xl DNA Analyzer (Applied Biosystems). The
polymorphic loci were analyzed using GeneMapper, version 4.0, with the LIZ 600 size standard
(Applied Biosystems).



2.3. Cross-species transferability 

To assess cross-species transferability and allele length polymorphisms, 30 randomly selected genic
SSRs were genotyped in 26 ecotypes (24 species) belonging to 5 sections of the genus Populus, using
Salix matsudana as the outgroup (Supplementary Table S3).

Summary statistics reflecting intra- and interpopulation genetic diversity, including the observed
allele sizes, polymorphism information content (PIC), expected heterozygosity (He), and number of
alleles (Na), were calculated for each marker using POPGENE, version 1.32. The discriminatory
abilities of genic SSRs were estimated using cluster analyses to assess phylogenetic relationships
among related Populus species. A neighbour-joining (NJ) dendrogram was constructed using the
proportion of shared alleles coefficient from PowerMarker, version 3.25, 19 and was drawn using
TreeView, version 1.66 (http://taxonomy.zoology.gla.ac.uk/rod/treeview.html).
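
For readers unfamiliar with the diversity statistics named above, the following Python sketch computes allele frequencies, expected heterozygosity, and PIC for one locus. It assumes the commonly used definitions He = 1 - sum(p_i^2) and the Botstein-style PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2; the software used in the study may apply sample-size corrections that are not shown here. The function names and the genotypes in the demo are hypothetical.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(genotypes):
    """genotypes: list of (allele1, allele2) tuples for diploid individuals."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def expected_heterozygosity(freqs):
    """He = 1 - sum of squared allele frequencies (no small-sample correction)."""
    return 1.0 - sum(p * p for p in freqs.values())


def pic(freqs):
    """Polymorphism information content from allele frequencies."""
    ps = list(freqs.values())
    homozygosity = sum(p * p for p in ps)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(ps, 2))
    return 1.0 - homozygosity - cross_term


if __name__ == "__main__":
    # Hypothetical amplicon sizes (bp) for four individuals at one locus.
    demo = [(200, 202), (200, 204), (202, 202), (204, 206)]
    freqs = allele_frequencies(demo)
    print(len(freqs), expected_heterozygosity(freqs), pic(freqs))
```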

2.4. In silico analysis of SSR variations 

We used in silico identification of genic SSR variations between P. tomentosa and P. trichocarpa to
validate the true SSR cross-species transferability within the genus Populus. All 220 available
homologous sequences of P. trichocarpa were identified in the JGI Database, based on 220 specific
amplicons with SSR markers selected from P. tomentosa candidate gene sequences. MEGA, version 5.1
(http://www.megasoftware.net/), was used to compare the amplified SSR alleles with the
SSR-containing sequences between the two species (i.e. repeat length, repeat motif, and mutations
in flanking sequences).

2.5. Testing inheritance in a linkage-mapping 
population 

To test the power of these novel genic SSR markers for constructing a family-based linkage map, 30
random genic SSRs selected from the 188 polymorphic markers (Fig. 1 and Supplementary Table S6)
were used to examine the observed segregation of markers, using 1000 F1 progeny from a controlled
hybridization between a female YX01 clone (Populus alba × Populus glandulosa) and a male LM50 clone
(P. tomentosa). 12 Mendelian inheritance of microsatellite variants (alleles) was determined by
comparing the observed distribution of progeny genotypes with the expected segregation ratios (1:1,
1:2:1, and 1:1:1:1), based on the hypothesized genotypes of the parents, using a chi-squared (χ2)
test at the 0.01 probability level in SAS, version 8.2 (SAS Institute, Cary, NC, USA).
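
The goodness-of-fit test described above can be sketched in a few lines of Python. This is an illustration only, not the SAS procedure used in the study: the function names are hypothetical, the progeny counts in the demo are invented, and the critical values are the standard upper-tail chi-squared values at alpha = 0.01 for 1, 2, and 3 degrees of freedom, which cover the 1:1, 1:2:1, and 1:1:1:1 ratios.

```python
# Upper-tail critical values of the chi-squared distribution at alpha = 0.01,
# indexed by degrees of freedom (number of genotype classes minus one).
CHI2_CRIT_001 = {1: 6.635, 2: 9.210, 3: 11.345}


def chi_squared_stat(observed, ratio):
    """Pearson chi-squared statistic for observed counts against a ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def fits_mendelian(observed, ratio):
    """Return (statistic, True if the ratio is not rejected at the 0.01 level)."""
    stat = chi_squared_stat(observed, ratio)
    df = len(observed) - 1  # only 2-4 genotype classes are supported here
    return stat, stat < CHI2_CRIT_001[df]


if __name__ == "__main__":
    # Hypothetical counts for 1000 progeny in two and three genotype classes.
    print(fits_mendelian([510, 490], (1, 1)))
    print(fits_mendelian([300, 450, 250], (1, 2, 1)))
```

A marker whose statistic exceeds the critical value would be flagged as showing segregation distortion at this threshold.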






Genomic DNA sequences of 150 candidate genes (P. tomentosa clone "LM50")
  -> SSR discovery using SSRIT: 544 SSR loci in 138 candidate genes
  -> SSR primers designed using the Primer 3 software: 481 primer pairs
  -> PCR amplification validated in P. tomentosa clone "LM50":
       85 larger or smaller than expected size; 68 no amplification product
  -> In silico analysis of SSR variations: comparison of SSR variations between
     P. tomentosa and P. trichocarpa
  -> Polymorphism tested on 15 P. tomentosa wild ecotypes:
       188 polymorphic; 24 monomorphic; 80 with identical heterozygous bands
  -> 30 random genic SSRs tested on 24 species within the genus Populus
     (cross-species transferability)
  -> 30 random genic SSRs validated in a family-based population of 1000 progeny
     (Mendelian inheritance test)
  -> 40 random genic SSRs tested in a P. tomentosa natural population of 460 individuals
     (association mapping study)

Figure 1. Flow diagram of P. tomentosa genic SSR marker development and applications in this study.



2.6. SSR marker-trait association studies in a natural 
population 

For association mapping, we used 40 genic SSRs randomly selected from the set of 188 polymorphic
markers (Fig. 1 and Supplementary Table S6) to test for association with several wood quality
traits in a natural population of 460 unrelated wild-type ecotypes from the P. tomentosa clonal
arboretum, established in the national nursery of Guan Xian County, Shandong Province, China
(36°23′N, 115°47′E), which represent all original provenances of this species (Supplementary
Table S2). 4,12

In the natural population, wood property traits, including microfibril angle (MFA), fibre length,
fibre width, and holocellulose, α-cellulose, and lignin contents, were measured according to the
methods described by Du et al. 20 Analysis of variance of these six phenotypic traits is shown in
Supplementary Table S4. These 40 random genic SSRs were used to test the degree of resolution of
genic SSRs in association with these 6 wood property traits, using the unified mixed linear model
(MLM) method with 10^4 permutations in TASSEL, version 2.0.1. 21,22 In this Q + K model, the
relative kinship matrix (K) was obtained using TASSEL, and the population structure matrix (Q) of
the covariates was identified by Du et al. 12 Corrections for multiple testing were performed using
the positive false discovery rate (FDR) method with 10^4 permutations. 23
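
To make the idea of multiple-testing correction concrete, the sketch below implements the Benjamini-Hochberg step-up adjustment in Python. This is a deliberately simpler stand-in and not the positive FDR estimator cited in the text, which is a different procedure; the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that each adjusted value is the
    # minimum of p * m / rank over the current rank and all higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical single-marker association P-values.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```

With 240 marker-trait tests, as in a 40 marker by 6 trait design, an adjustment of this kind controls the expected share of false positives among the associations declared significant.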
3. Results

3.1. Frequency and distribution of genic SSR markers

By sequencing 150 candidate genes related to wood properties, we identified 544 SSR loci located in
138 (92%) unique genes from 696 792 bp total sequence, for an average of 1 SSR per 1.3 kb. The
perfect SSRs were not evenly distributed, ranging from zero to seven per gene (average, 3.6), and
seven loci were compound SSRs containing at least one repeat motif (Fig. 1 and Supplementary
Table S1).

Analysis of all 544 gene-derived SSR loci revealed that dinucleotide (325, 59.7%) and trinucleotide
repeat motifs (144, 26.5%) predominated, followed by tetranucleotide (33, 6.1%), hexanucleotide
(13, 2.4%), and pentanucleotide (12, 2.2%) repeat motifs (Fig. 2A). Of the identified SSRs,
slightly fewer than half (46.1%) were located in promoter regions; of these, 57.4% were
dinucleotides and 28.7% were trinucleotides (Fig. 2A and Supplementary Table S5A). Conversely, in
exons, 37.5% were dinucleotides and 62.5% were trinucleotides (Fig. 2A and Supplementary
Table S5A). The dinucleotide repeat AT/TA was the most abundant motif detected in all genic SSRs
(177, 32.6%), followed by ATT/TAA (53, 9.8%), AG/TC (49, 9.0%), CT/GA (48, 8.7%), and AAT/TTA
(16, 3.0%) (Fig. 2B and Supplementary Table S5B). SSR length was most commonly 10-20 bp (70.5% of
total SSRs), followed by 21-30 bp (20.9%) (Fig. 2C). The largest SSR found was a 68-bp dinucleotide
repeat (AT/TA).

3.2. SSR primer design, screening, and polymorphism 
testing 

To determine whether the SSRs varied in length, and would therefore provide useful markers, we
designed flanking primers and determined the length of each SSR by capillary electrophoresis.
Primer pairs specific for flanking sequences were designed for 481 of the SSR sequences (88.5%).
Using the P. tomentosa LM50 clone, 413 primer pairs (85.9%) gave successful amplification; the
remaining 68 failed to generate PCR products at any annealing temperature (Ta) used and so were
excluded from further analysis (Fig. 1). Of the 413 working primer pairs, 292 (70.7%) amplified PCR
products of the expected sizes; however, 65 and 20 PCR products were larger or smaller,
respectively, than expected, and the remaining 36 primer pairs generated multiple PCR products
(Fig. 1). Details of the 292 optimal primers, including locus names, primer sequences, SSR repeat
motifs, Ta, expected sizes, SSR sources (locations in genes), and GenBank accession numbers are
provided in Supplementary Table S6. Distribution summaries, frequencies, and repeat motifs of these
292 genic SSR markers are provided in Supplementary Table S7A.

Figure 2. (A) Distribution of repeat motifs in 544 genic SSRs identified in 138 P. tomentosa genes;
(B) Distribution of dinucleotide repeat motifs detected in 138 P. tomentosa genes; (C) Distribution
of repeat numbers in di- and trinucleotide SSRs. Bars of different colours

To identify whether the SSRs varied in length in P. tomentosa, these 292 primers were subjected to
further screening for polymorphisms in 15 wild P. tomentosa ecotypes. Twenty-four primers amplified
monomorphic products (1 band), 80 markers showed identical heterozygous genotypes among all 15
individuals (2 bands), and 188 primers (64.4%) generated clean and reproducible polymorphic bands
(Fig. 1 and Supplementary Table S6). The observed product sizes and numbers of alleles per locus
(Na) were calculated for each marker. A total of 861 unique alleles were identified (Supplementary
Table S6). For the 188 polymorphic markers from 107 genes, Na ranged from 2 to 7 with an average of
3.6 observed alleles per polymorphic locus; across gene regions, the mean Na values varied between
2.4 for exons and 4.0 for promoter regions (Supplementary Tables S6 and S7B).
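
The screening counts reported in this section form a funnel whose stages can be cross-checked with simple arithmetic. The Python sketch below does this using only counts quoted in the text; the dictionary keys and the helper pct are hypothetical names introduced for illustration.

```python
# Counts quoted in the text for each screening stage.
counts = {
    "primer_pairs_designed": 481,
    "no_product": 68,
    "expected_size": 292,
    "larger_than_expected": 65,
    "smaller_than_expected": 20,
    "multiple_products": 36,
    "monomorphic": 24,
    "identical_heterozygotes": 80,
    "polymorphic": 188,
}


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Primer pairs that amplified in the LM50 clone.
amplified = counts["primer_pairs_designed"] - counts["no_product"]

# The four amplification outcomes should account for every working pair.
by_outcome = (counts["expected_size"] + counts["larger_than_expected"]
              + counts["smaller_than_expected"] + counts["multiple_products"])

# The three polymorphism outcomes should account for every optimal primer.
screened = (counts["monomorphic"] + counts["identical_heterozygotes"]
            + counts["polymorphic"])

if __name__ == "__main__":
    print(amplified, by_outcome, screened)
    print(pct(amplified, counts["primer_pairs_designed"]),
          pct(counts["expected_size"], amplified),
          pct(counts["polymorphic"], counts["expected_size"]))
```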

3.3. SSR cross-species transferability within the genus 
Populus 

To determine the utility of this SSR marker set beyond P. tomentosa, we screened a subset of the
markers to determine whether they could be amplified from related species, including other Populus
species and the closely related genus Salix. The capacity of 30 representative P. tomentosa genic
SSR primers to screen for polymorphisms was tested in 27 ecotypes of related Populus and Salix
species (Supplementary Table S3). All tested SSR primer pairs displayed a high amplification
frequency within and across sections at the subgenus level within Populus. Of the 30 examined SSR
markers, 12 (40%) successfully amplified in all species. The transferability of the 30 primers
tested in Populus sections was Leuce (100%), Tacamahaca (90%), Aigeiros (83%), Leucoides (80%), and
Turanga (72%) (Table 1). In S. matsudana, the frequency of amplification was lower, and only 50%
yielded product (Table 1). Thus, the developed SSR markers could be applied across the genus
Populus and provided data on polymorphisms among related species (Supplementary Fig. S1).

We also examined the allelic variation in these ecotypes, to determine whether these SSR markers
could prove useful for genetic studies such as association mapping. Across the 27 ecotypes, 213
alleles were detected using the 30 SSR primers, with an average of 7.0 alleles per locus, ranging
from 4 at SSR62 to 11 at SSR54 (Table 1). The mean PIC of the loci was 0.675, and it ranged from
0.411 for SSR65 to 0.841 for SSR54 (Table 1). In the five Populus sections, the mean Na values
ranged from 1.1 (Leucoides) to 3.7 (Leuce). The other diversity parameters (observed lengths and
He) are shown in Table 1.

We also tested whether these markers were informative for genetic analysis by determining whether
the marker genotypes could be used to recapitulate the known phylogenetic relationship among the
tested ecotypes. The NJ tree (Fig. 3) based on the shared allele coefficients from these 30 SSRs
revealed the expected genetic relationships among all 27 ecotypes and clustered them into 6 main
groups, in agreement with the 5 putative sections and the outgroup derived from the basic botanical
classification of these tree species. 10,24,25 The relationships among species in each section are
represented by the smaller branches within groups. In addition, we constructed trees based on the
other functions in PowerMarker, and these were very similar to the NJ tree (Fig. 3) in that the six
distinct clusters were again present, although there were slight differences in branch lengths
within clusters (data not shown).
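
The shared-allele coefficient underlying the NJ tree can be illustrated for a pair of diploid genotypes. The Python sketch below assumes the usual definition, in which the distance is one minus the number of alleles shared across loci divided by twice the number of loci; PowerMarker's exact implementation may differ in detail. The function name is hypothetical, and the allele sizes in the demo are invented even though SSR54 and SSR62 are marker names from the text.

```python
from collections import Counter


def shared_allele_distance(ind_a, ind_b):
    """Distance = 1 - proportion of shared alleles over loci typed in both.

    ind_a, ind_b: dict mapping locus name -> (allele1, allele2).
    """
    loci = [locus for locus in ind_a if locus in ind_b]
    if not loci:
        raise ValueError("no loci genotyped in both individuals")
    shared = 0
    for locus in loci:
        # Multiset intersection counts 0, 1, or 2 shared alleles at a locus.
        overlap = Counter(ind_a[locus]) & Counter(ind_b[locus])
        shared += sum(overlap.values())
    return 1.0 - shared / (2 * len(loci))


if __name__ == "__main__":
    # Hypothetical genotypes (amplicon sizes in bp) for two individuals.
    a = {"SSR54": (200, 202), "SSR62": (150, 150)}
    b = {"SSR54": (202, 204), "SSR62": (150, 150)}
    print(shared_allele_distance(a, b))
```

A matrix of such pairwise distances is the kind of input a neighbour-joining algorithm takes to build a dendrogram.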

3.4. Comparison of SSR variation between P. tomentosa 
and P. trichocarpa 
In addition to testing a subset of markers in multiple Populus and Salix species, we also examined
these markers more extensively in the sequenced genome of P. trichocarpa. Of the 292 genic
SSR-containing sequences, homologous P. trichocarpa sequences (length, 90-593 bp) were identified
for only 75% (220) (Fig. 1). Alignment and comparison of the homologous sequences between these two
species for each marker revealed 10 types of mutations and variations in SSR loci and their
flanking regions (Table 2 and Fig. 4). For example, the SSR53 locus, which contains an imperfect
(CT) motif, showed complex mutations characterized by variation in the repeat-motif length and
point mutations in both the repeat motif and the flanking sequences (data not shown). Of all 220
genic SSRs, 39.1% (86) were present in P. trichocarpa, with 57 showing only variation in repeat
number between the two species (conserved flanking sequences) and 29 that were monomorphic and had
conserved sequences between the 2 species (Table 2). However, 41.8% (92) had polymorphisms of the
SSR flanking sequences between the two species, suggesting that these SSR markers from P. tomentosa
are not directly transferable to P. trichocarpa (Table 2). A comparison of the SSR markers with
mutations in flanking sequences showed the highest proportion in 3'UTRs (60.0%) and the lowest in
5'UTRs (25.8%). A total of 25 new SSR markers



Table 1. Diversity of 30 polymorphic P. tomentosa genic SSR markers for 26 genotypes within the genus Populus

Marker   Na   PIC    Size (bp) Na  He     Size (bp) Na  He     Size (bp) Na  He     Size (bp) Na  He     Size (bp) Na  He     Size (bp) Na  He
              0.632  377       1   0.000  373-379   4   0.821  375-377   2   0.429  371       1   0.000  375       1   0.000  385       1   0.000
SSR65    6    0.411  263-273   4   0.725  257-263   3   0.264  0         0   /      265       1   0.000  271       1   0.000  269-273   2   1.000
SSR66    5    0.463  0         0   /      215-217   2   0.440  199-219   3   0.644  221       1   0.000  217-221   2   1.000  0         0   /
SSR67    7    0.789  240-242   2   0.636  242-246   3   0.350  242-250   3   0.607  246-250   2   0.667  246       1   1.000  260       1   0.000
SSR69    9    0.834  213-219   4   0.928  221-235   4   0.333  215-227   4   0.786  219       1   0.000  219       1   0.000  209-213   2   1.000
SSR70    8    0.727  310-322   4   0.643  312-318   4   0.733  314-324   3   0.607  0         0   /      314-318   2   1.000  0         0   /
SSR71    10   0.771  192-228   6   0.849  201-222   4   0.636  228-246   4   0.821  234       1   0.000  204-234   2   1.000  237       1   0.000
SSR73    10   0.828  169-183   4   0.588  177-191   5   0.835  173-191   6   0.703  0         0   /      0         0   /      0         0   /
SSR74    7    0.750  114-120   3   0.439  114-124   4   0.846  118-126   4   0.711  0         0   /      0         0   /      0         0   /
SSR75    8    0.766  202-220   4   0.867  202-210   4   0.679  204-210   4   0.750  206-208   2   0.667  206-208   2   1.000  220       1   0.000
SSR77    8    0.690  165-179   5   0.933  163-169   3   0.604  165-175   4   0.679  165-169   2   0.667  165-169   2   1.000  181       1   0.000
SSR91    5    0.640  139-151   4   0.636  135-147   3   0.590  0         0   /      147-155   2   0.667  0         0   0      0         0   /
SSR96    6    0.739  111-123   3   0.725  108-123   4   0.632  114-117   2   0.596  0         0   /      114       1   0.000  0         0   /
SSR98    8    0.798  224-234   4   0.642  224-238   5   0.846  226-236   3   0.607  226-228   2   0.667  0         0   /      0         0   /
SSR113   6    0.591  193-199   3   0.408  184-196   4   0.630  196-202   3   0.593  187-193   2   0.667  193       1   0.000  184-187   2   1.000
SSR165   8    0.637  310-322   4   0.636  307-322   4   0.532  304-316   5   0.720  319-325   3   0.833  304-310   2   1.000  322       1   0.000
SSR169   7    0.664  209-219   4   0.650  205-217   4   0.633  0         0   /      205-209   2   0.667  211       1   0.000  0         0   /
SSR170   6    0.479  214-220   4   0.569  202-220   4   0.705  202-214   2   0.511  212-220   3   0.833  202-208   2   1.000  222       0   /
SSR176   5    0.732  343-355   4   0.701  346-355   2   0.360  343-355   3   0.531  0         0   /      358       1   0.000  0         0   /
SSR179   8    0.679  245-261   3   0.558  249-263   4   0.687  253-261   3   0.607  263-265   2   0.667  0         0   /      0         0   /
Mean     7.0  0.675  /         3.4 0.599  /         3.7 0.582  /         2.6 0.492  /         1.3 0.300  /         1.1 0.467  /         0.7 0.233

See Supplementary Table S6 for further details of the 30 genic SSR markers.
Na, observed number of alleles per locus; PIC, polymorphism information content; He, expected heterozygosity; /, not applicable.

Figure 3. Phylogenetic relationship among 26 genotypes belonging to 5 sections of the genus Populus (S. matsudana as outgroup) based on
30 polymorphic genic SSR markers. The different colour branches denote the divergent clusters.



were identified in the corresponding sequences of P. trichocarpa, of which 17 (68%) were derived
from promoter regions (Table 2). Additionally, 100% of SSR markers from exon regions could be
directly employed in genetic studies of P. trichocarpa (Table 2).
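
The regional percentages quoted for flanking-sequence mutations can be reproduced from the Table 2 counts. The Python sketch below sums the four Table 2 rows that involve a flanking-sequence mutation and divides by the column totals; the variable names are hypothetical, and the exon column total of 8 is an assumption inferred here as 220 minus the other four column totals.

```python
REGIONS = ("promoters", "introns", "5'UTRs", "3'UTRs", "exons")

# Table 2 rows that involve a mutation in the flanking sequence, in the order:
# flanking only; flanking + repeat number; flanking + no SSR; flanking + new SSR.
FLANKING_ROWS = [
    (5, 3, 2, 1, 0),
    (24, 24, 6, 6, 0),
    (7, 6, 0, 2, 0),
    (4, 2, 0, 0, 0),
]

# Column totals from Table 2; the exon total (8) is inferred as
# 220 - (89 + 77 + 31 + 15), which is an assumption of this sketch.
REGION_TOTALS = (89, 77, 31, 15, 8)

# Number of loci with flanking-sequence mutations in each region.
flanking_by_region = [sum(column) for column in zip(*FLANKING_ROWS)]

# Share of loci with flanking-sequence mutations, per region and overall.
share = {
    region: round(100.0 * flanking / total, 1)
    for region, flanking, total in zip(REGIONS, flanking_by_region, REGION_TOTALS)
}
overall = round(100.0 * sum(flanking_by_region) / sum(REGION_TOTALS), 1)

if __name__ == "__main__":
    print(flanking_by_region, share, overall)
```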

We determined whether the mutations in SSR repeat motifs in the P. tomentosa-P. trichocarpa
comparison were transitions, transversions, insertions, or deletions. An insertion of even a single
base in the repeat motif, e.g. (AT)5T(AT)3, and a deletion within the repeat tract, e.g.
(AGG)3AG_(AGG)2, disrupted the continuity of the perfect repeat units. Transversions (38/63) were
the most abundant mutations in the sequences of these two species, whereas insertions (4/63) were
the least abundant (Table 3). Transitions and/or transversions accounted for 86% of the total
mutations in the perfect repeats, whereas insertions and/or deletions were found in only 14%. T/C
transversions were the most common substitutions that disrupted repeat continuity (Table 3).

3.5. Inheritance test of genic SSR markers

To further validate the usefulness of these genic SSR markers in genetic studies, we genotyped a
randomly selected subset of 30 of the markers in a linkage mapping population of 1000 progeny from
a controlled hybridization and tested for Mendelian segregation. Of the 30 genic SSRs, 5 loci (17%)
were not used for analysis due to the presence of null alleles or unexpected lengths in the female
parent. The remaining 25 loci segregated in the population (Table 4). Of the segregating loci, 2
were heterozygous in the female parent only, 4 were heterozygous in the male parent only, 17 were
heterozygous in both parents, and 2 (SSR211 and SSR249) were homozygous in both parents, and thus
resulted in offspring identical to the parents and with the expected heterozygote genotype
(Table 4).

A chi-squared test was used to compare the segregation ratios of the 23 informative markers in
these 1000 offspring. Eighteen were in accordance with Mendelian expectations (P > 0.01), with a
segregation ratio close to 1:2:1 for 3 SSR loci, 1:1 for 5 loci, and 1:1:1:1 for the remaining 10
markers (Table 4). Thus, these novel genic SSR markers may represent a useful resource for linkage
mapping in P. tomentosa.

3.6. Genic SSR polymorphisms associated with wood property traits

Finally, to directly show that these markers could also prove useful for association mapping, we
conducted a trial association mapping study using a subset of the genic SSRs to test for
association with traits affecting wood properties in Populus. For a random selection of 40 genic
SSRs, single-marker association tests (240; 40 SSRs × 6 traits) were conducted using MLM. Twenty
associations were found to be significant at a threshold of P < 0.05 (Supplementary
Table S8). Multiple test corrections using the FDR 
Table 2. Comparison of variations and mutations of 220 genic SSR loci and their flanking sequences between P. tomentosa and P. trichocarpa

Types of variations and mutations                     Promoters  Introns  5'UTRs  3'UTRs  Exons  Total
Mutation in the flanking sequence only                        5        3       2       1      0     11
Mutation in the flanking sequence and repeat number          24       24       6       6      0     60
Mutation in the flanking sequence and no SSR marker           7        6       0       2      0     15
Mutation in the flanking sequence and new SSR marker          4        2       0       0      0      6
Variation of repeat number only                              18       24      10       3      2     57
Mutation in SSR repeat motif only                             1        4       3       1      0      9
Mutation in SSR repeat motif and repeat number                3        1       0       0      0      4
No SSR marker only                                            1        1       1       1      0      4
New SSR marker only                                          17        5       3       0      0     25
Identical sequence                                            9        7       6       1      6     29
Total                                                        89       77      31      15      8    220

Figure 4. Alignment and comparison of variations and mutations in genic SSRs with sequences homologous between P. tomentosa (01) and
P. trichocarpa (02), based on a number of genic SSRs. Ten types of mutations and variations for genic SSR loci with their flanking regions
were identified (see Table 2 for details).



Table 3. Number and type of mutations found in perfect microsatellite motifs in P. tomentosa and P. trichocarpa

Mutation class     Type   Number (%)
Transition (16)    A/T    13 (24%)
                   C/G    3 (6%)
Transversion (38)  A/G    11 (20%)
                   A/C    5 (9%)
                   T/C    17 (31%)
                   T/G    5 (9%)
Deletion (5)       A (1), T (2), C (1), G (1)
Insertion (4)      A (2), T (2), C (0), G (0)







method reduced this number to 12 at a significance threshold of
Q < 0.10 (Table 5). Each of these loci accounted for 4.0 to 8.9% of
the phenotypic variance (Table 5). Of these, α-cellulose and fibre
length had three significant associations each, and holocellulose,
MFA, and lignin had two significant associations each. However, no
association with fibre width was detected (Q < 0.10,
Table 4. Mendelian inheritance patterns of 30 genic SSR markers based on segregation analyses of a cross between a female YX01 clone
(P. alba × P. glandulosa) and a male LM50 clone (P. tomentosa)

Locus     P1 (♀)                   P2 (♂)       F1 progeny genotypes                Expected ratio  P-value
SSR2      331/341 (+)              329/335 (+)  329/331, 329/341, 331/335, 335/341  1:1:1:1         NS
SSR7      156/162 (+)              159/165 (+)  156/159, 156/165, 159/162, 162/165  1:1:1:1         P < 0.01
SSR9      250/256 (+)              250/256 (+)  250, 250/256, 256                   1:2:1           NS
SSR13     155/164 (+)              155/158 (+)  155, 155/158, 155/164, 158/164      1:1:1:1         NS
SSR16     341 (-) Unexpected size  363 (-)      /                                   /               /
SSR21     167/179 (+)              169/175 (+)  167/169, 167/175, 169/179, 175/179  1:1:1:1         NS
SSR33     92 (-)                   92/98 (+)    92, 92/98                           1:1             NS
SSR44     151/156 (+)              156 (-)      151/156, 156                        1:1             P < 0.01
SSR45     188/192 (+)              192/196 (+)  188/192, 188/196, 192/196, 192      1:1:1:1         NS
SSR47     186/195 (+)              192/195 (+)  186/192, 186/195, 192/195, 195      1:1:1:1         P < 0.01
SSR50     233/240 (+)              233/240 (+)  233, 233/240, 240                   1:2:1           NS
SSR52     145 (-)                  145/148 (+)  145, 145/148                        1:1             NS
SSR56     228 (-)                  221/228 (+)  221/228, 228                        1:1             NS
SSR61     No amplification         237 (-)      /                                   /               /
SSR63     202 (-)                  202/206 (+)  202, 202/206                        1:1             NS
SSR71     204/210 (+)              204/213 (+)  204, 204/210, 204/213, 210/213      1:1:1:1         NS
SSR77     No amplification         171/175 (+)  /                                   /               /
SSR107    169/172 (+)              169 (-)      169, 169/172                        1:1             NS
SSR108    177/183 (+)              171/179 (+)  171/177, 177/179, 171/183, 179/183  1:1:1:1         NS
SSR114    102/110 (+)              104/108 (+)  102/104, 102/108, 104/110, 108/110  1:1:1:1         P < 0.01
SSR118    191/194 (+)              191/197 (+)  191/197, 191/194, 194/197, 191      1:1:1:1         NS
SSR124    192/196 (+)              188/192 (+)  188/192, 188/196, 192/196, 192      1:1:1:1         NS
SSR142    227/231 (+)              229/237 (+)  227/229, 227/237, 229/231, 231/237  1:1:1:1         NS
SSR150    No amplification         240/244 (+)  /                                   /               /
SSR188    207/211 (+)              207/213 (+)  207, 207/211, 207/213, 211/213      1:1:1:1         P < 0.01
SSR197    430 (-) Unexpected size  417/419 (+)  /                                   /               /
SSR211    157 (-)                  160 (-)      157/160                             /               /
SSR233    291/295 (+)              289/299 (+)  289/291, 289/295, 291/299, 295/299  1:1:1:1         NS
SSR249    179 (-)                  179 (-)      179                                 /               /
SSR254    283/289 (+)              283/289 (+)  283, 283/289, 289                   1:2:1           NS

See Supplementary Table S6 for further details of these 30 genic SSR markers.

'P1' represents the female clone 'YX01' (P. alba × P. glandulosa), 'P2' represents the male clone 'LM50' (P. tomentosa), '+'
represents heterozygote, and '-' represents homozygote; the χ² significance level was P < 0.01; NS, not significant; /, not
applied.



Table 5). The 12 associations represent 9 SSR loci from different
regions of 9 candidate genes (Table 5). For example, SSR205,
located in the coding region (PtoC4H1 exon 1), was associated
exclusively with lignin (R² = 8.9%, Q = 0.0211) (Table 5). For the
holocellulose trait, SSR47 and SSR163, located in the non-coding
regions of two candidate genes (PtoKorB and PtoCslA4), accounted
for 4.3-8.7% of the phenotypic variance, and the SSR47 marker was
similarly associated with α-cellulose content (Table 5).
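
The multiple-testing step can be sketched as follows. The text specifies only "the FDR method"; this sketch assumes the Benjamini-Hochberg step-up procedure, which may differ from the procedure actually used, and the P-values are hypothetical rather than the 240 values from this study. The function name `benjamini_hochberg` is introduced for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values (Q) in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that the adjusted
    # values never increase as the rank decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical single-marker P-values; NOT values from this study.
p = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(p)
kept = [i for i, value in enumerate(q) if value < 0.10]
print([round(value, 5) for value in q])
print(kept)
```

Under this assumption, an association is retained when its adjusted value falls below the Q < 0.10 threshold used for Table 5; tests that pass P < 0.05 individually can still be dropped once all tests are considered together.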



4. Discussion 

4.1. Development and characterization of genic SSR
markers

Here, we used a candidate gene approach to identify a set of SSR
markers in P. tomentosa genes for wood properties, showing that
this approach can provide useful genomic resources for linkage or
association mapping and eventually for marker-assisted breeding to
improve wood quality in this important timber crop. We successfully
mined for genic SSRs in the sequences of 150 candidate genes
associated with wood property traits. Our analysis of these SSR
markers from P. tomentosa demonstrated that this approach may be
useful for characterization of other pathways in P. tomentosa and
for development of molecular markers for related model species with
similar genomic resources. In our study, the frequency of genic
SSRs was ~1 SSR/1.3 kb. Higher or lower frequencies of genic SSRs
have been reported elsewhere.11,26,27 However, these frequency
variations may be the result of the application of different SSR
search criteria or methods, or of differences in the abundance of
the DNA sequence sources searched.8
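
To make the dependence on search criteria concrete, the sketch below mines perfect SSRs from a sequence with a regular expression. The minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration and are not the criteria used in this study; `find_perfect_ssrs` and `is_primitive` are hypothetical helper names, and both test sequences are invented.

```python
import re

# Assumed minimum number of repeat units for each motif length (2-6 bp).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """False if the motif is itself a repeat of a shorter unit (e.g. ATAT, AAA)."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR tract in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return hits


perfect = "GGC" + "AT" * 7 + "CCT" + "AGG" * 5 + "TT"
interrupted = "AT" * 5 + "T" + "AT" * 3  # an (AT)5T(AT)3-type tract

print(find_perfect_ssrs(perfect))      # two perfect tracts
print(find_perfect_ssrs(interrupted))  # none under these thresholds
```

Lowering or raising the thresholds changes how many loci are reported for the same sequence, which is one reason SSR frequencies are hard to compare across studies. The second sequence also shows how a single inserted base can remove a locus from a perfect-repeat search.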

Dinucleotide motifs were the most abundant among the genic SSR
markers (Fig. 2A and Supplementary Table S5A). This result
contrasts with previous findings identifying trinucleotide motifs
as the most frequent genic repeats in most plant species,8,28,29 a
difference that may be attributable to the sources of the DNA
sequences used (i.e. EST, cDNA, or gene sequences). In general,
previous genic SSR markers were identified from EST databases, and
information about promoters and introns was not considered. Genic
microsatellites are located mainly in promoters, introns, and UTRs
of sequenced genes and are found at a lower frequency in conserved
exons.8 This agrees with the distribution pattern we report here
(Fig. 2A), suggesting that genic microsatellites may have a role in
regulation of gene expression.8,11,28-30 Furthermore, polymorphism
tests of 15 wild P. tomentosa ecotypes indicated that exonic SSRs
contained less allelic variability than did non-coding SSRs
(Supplementary Table S7B), which reflects the higher selection
pressure on the coding portion of the genome.31,32

Our analysis of gene-derived SSRs found that SSRs with AT/TA motifs
(32.6%) were the most frequent. This is similar to the cases of
P. trichocarpa and Arachis hypogaea.25,33 Previous studies have
indicated that (AT)n is the most common dinucleotide motif in plant
genomes.8,33 Thus, the genic SSR pattern identified in this study
is likely to be a good reflection of genome-wide SSR frequencies.
Among trinucleotide SSRs, the polymorphism rate of the ATT/TAA SSRs
was three times higher than that of AAT/TTA SSRs. This distribution
suggests that AT-rich SSR loci may have relatively high variability
in P. tomentosa and is consistent with the finding that AT-rich
repeats (those containing two or more A and/or T nucleotides per
motif) are more common in non-coding regions.9,17,25,34

In this study, 91.8% of the 292 optimal genic SSR primers produced
at least 2 clean amplified bands, possibly due to the naturally
occurring excess of heterozygotes in the P. tomentosa genome.12
Allelic diversity estimated for these polymorphic markers averaged
3.6 alleles per locus (range 2-7 alleles), which is lower than the
N_A (4.3) in a population of 460 P. tomentosa and in some other
related species, such as Populus nigra, P. trichocarpa, Populus
tremuloides, and Populus euphratica.6,35,36 The level of allelic
diversity reported in this study may be due to the limited sample
size and/or the relatively conserved character of genic SSR
markers. For generation of genetic maps, genic SSRs can determine
the relative positions of transferable markers and allow direct
comparison of candidate gene-containing SSRs and quantitative trait
locus (QTL) locations across a broad variety of genetic
backgrounds.8,25,29,32 The results of inheritance tests for 30
genic SSR markers suggest that many genic SSRs can be used for
linkage mapping in P. tomentosa and that they are also useful for
QTL mapping and marker-aided selection of important traits. In
addition, segregation distortion is increasingly recognized as a
potentially powerful evolutionary force that may be beneficial for
QTL mapping.37,38 The observed segregation distortions of markers
are caused by genes subjected to gametic or zygotic selection.37
Studies have suggested that epistasis contributes to segregation
distortion, and segregation distortion may also be important for
the evolution of many fundamental aspects of sexual
reproduction.37-39

The presence of SSRs in transcribed regions can result in changes
in function, transcription, or translation. For example, SSRs in
coding regions that result in amino acid changes can cause either
loss or gain of function, SSRs in the 5'UTR can affect gene
transcription or translation, SSRs in the 3'UTR can be responsible
for gene silencing or transcription slippage, and SSRs in introns
might act as transcriptional enhancers of gene expression.8,32,40
A previous study localized phenylalanine ammonia-lyase (PAL)
transcripts to developing xylem cells in aspen (P. tremuloides)
stems, consistent with the involvement of PAL in lignin
biosynthesis.41 That report supports our identification of a
single-marker non-coding association in PtoPAL2 that explained 4.6%
of the phenotypic variation in lignin content (Table 5). Cinnamate
4-hydroxylase (C4H1) is proposed to be associated with G lignin
deposition,42,43 and a significantly associated marker located in
the coding region (PtoC4H1 exon 1) was associated exclusively with
lignin (R² = 8.9%, Q = 0.0211) (Table 5). Physiological studies of
C4H genes describe unique functions for the isoforms within the
lignin biosynthetic pathway.43 Similarly, other significant
associations located in different candidate genes, such as sucrose
synthase (SUSY), cellulase (KOR), and cinnamoyl-CoA reductase (CCR)
(Table 5), were also supported by studies finding that these genes
are involved in lignocellulosic cell wall
development.3,14-16,43-45 Association analysis in this study
suggests that genic SSRs have considerable power in candidate gene-
Table 5. Significant SSR marker-trait pair associations in the
natural P. tomentosa population (n = 460), after correction for
multiple testing [FDR (Q) < 0.10]

Trait          Gene symbol          Locus   P         R² (%)  Q
Lignin         PtoPAL2 (Intron 1)   SSR198  0.0021    4.6     0.0250
               PtoC4H1 (Exon 1)     SSR205  0.0016    8.9     0.0211
Holocellulose  PtoKorB (Intron 3)   SSR47   0.0031    7.0     0.0371
               PtoCslA4 (Promoter)  SSR163  0.0106    6.4     0.0475
α-Cellulose    PtoKorB (Intron 3)   SSR47   0.0095    4.0     0.0475
               PtoWuA (Promoter)    SSR53   0.0200    4.5     0.0702
               PtoSuSy6 (Promoter)  SSR71   0.0001    7.4     0.0172
MFA            PtoPAL2 (Intron 1)   SSR198  0.0205    8.0     0.0750
               PtoCCR2 (5'UTR)      SSR224  4.33E-04  7.5     0.0105
Fibre length   PtoExp10 (Promoter)  SSR79   0.0022    5.3     0.0278
               PtoCslA4 (Promoter)  SSR163  0.0175    5.5     0.0669
               PtoCslD9 (Exon 4)    SSR187  0.0027    6.4     0.0346

P = the significance level for association (the significance
threshold is P < 0.05); R² = percentage of the phenotypic variance
explained; Lignin = lignin content; Holocellulose = holocellulose
content; and α-cellulose = α-cellulose content.

based association studies to identify allelic variation in genes
associated with important wood quality traits,20 and these markers
have the potential to be useful in genetic improvement of lignin
and cellulose biosynthesis in poplar.

4.2. SSR variation among related species within the 
genus Populus 
The genus Populus contains six subgenera: Abaso, Leuce, Leucoides,
Aigeiros, Turanga, and Tacamahaca, and Turanga is the most distant
from Leuce.10,24,25,46,47 We found that genic SSR markers in
P. tomentosa have relatively high amplification rates among closely
related taxa, including species of the Leuce, Tacamahaca, Aigeiros,
and Leucoides sections, but a lower amplification rate in Turanga
(Table 1). The amplification success rate in the different sections
was positively correlated with the mean N_A and PIC values
(Table 1); this pattern is due mainly to phylogenetic divergence
from P. tomentosa and is also related to the species sampled and
their genetic backgrounds in each section. Furthermore, the
amplification success rates were higher for members of the other
subgenera, with the exception of Turanga (Table 1), suggesting that
the amplification frequency varies in tandem with evolutionary
distance. This may be because the flanking regions of the
microsatellite, where the PCR primers bind to the DNA, are more
similar in phylogenetically close species than in phylogenetically
distant species.10,17,25,32 It should, therefore, be possible to
predict the utility of these primers according to the genetic
distance from P. tomentosa.
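
For reference, the two diversity statistics mentioned above can be computed from genotype data as in the sketch below. It assumes the standard definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, where p_i is the frequency of allele i; whether exactly this estimator was used for Table 1 is an assumption. The genotypes are hypothetical and the function names are introduced for illustration only.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Allele frequencies from diploid genotypes given as (allele, allele) pairs."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def number_of_alleles(genotypes):
    """N_A: number of distinct alleles observed at the locus."""
    return len(allele_frequencies(genotypes))


def pic(genotypes):
    """Polymorphism information content under the standard definition."""
    p = list(allele_frequencies(genotypes).values())
    homozygosity = sum(x * x for x in p)
    cross = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1 - homozygosity - cross


# Hypothetical genotypes (fragment sizes in bp) for four individuals.
genotypes = [(155, 164), (155, 158), (155, 155), (158, 164)]
print(number_of_alleles(genotypes))
print(pic(genotypes))
```

A locus that fails to amplify in a distant section contributes no genotypes at all, so both statistics are computed only over the individuals that were successfully scored.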

Six distinct groups were identified within 27 samples, based on 30
genic SSR markers (Fig. 3). The dendrogram grouping showed the
expected separation of the different sections within the genus
Populus, which agreed with the botanical classification of
Salicaceae at the subgenus level,24,25,46,47 although a few species
were not placed according to their putative relationships or as
reported in other diversity studies.46,47 For example, Populus
bolleana is a variety of Populus alba; however, these two did not
group together in the closest branches, despite their shared
ancestry (Fig. 3). Overall, this analysis roughly supports the
botanical classification of the germplasm surveyed and hints at the
usefulness of the genic SSR markers for genetic diversity studies
and other Populus genotyping applications.

In a comparison of genic SSR variations using homologous
P. tomentosa and P. trichocarpa sequences (Table 2), we identified
both repeat motif variations and mutations in the flanking
sequences in most markers, which shows that microsatellite mutation
patterns are often complex in cross-species amplification. However,
our findings indicate that the effects of mutations accumulated
over evolutionary time can readily be studied. Point mutations
disrupted the repeat pattern, and in addition to a new class of
repeat motifs, size variations at microsatellite loci caused by
indels (insertions or deletions) in flanking sequences have also
been reported.32,40,48,49 We found the highest proportion of SSRs
with mutations in flanking sequences in the 3'UTR (60.0%) and the
lowest in the 5'UTR (25.8%). This suggests that variations and
mutations in microsatellites may be influenced by the nature and
functional composition of the flanking sequences.48,49 Transitions
and/or transversions accounted for 86% of the total mutations of
perfect repeat motifs in the two species (Table 3), indicating that
base substitution in microsatellites is more common than length
variation at homologous loci between closely related species.49 It
also suggests that base substitution data are vital, because
genotyping with SSR alleles among related species is often
erroneous if based only on microsatellites of identical allele
lengths obtained in electrophoresis. In silico identification of
genic SSR variations in Populus is particularly useful for
evolutionary studies and for increasing our understanding of the
origin, mutational processes, and structure of microsatellites.
Further reciprocal studies of microsatellite allelic variation in
different species would shed additional light on directional
evolution and genetic toxicology. ,48
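
The percentages quoted in this paragraph can be re-derived from the counts in Tables 2 and 3. The short sketch below does so for the four non-coding regions; the counts are transcribed from those tables, and the variable and function names are introduced only for this check.

```python
REGIONS = ["Promoters", "Introns", "5'UTRs", "3'UTRs"]

# Table 2 rows that involve a mutation in the flanking sequence,
# in the column order given by REGIONS.
FLANKING_ROWS = [
    [5, 3, 2, 1],    # flanking sequence only
    [24, 24, 6, 6],  # flanking sequence and repeat number
    [7, 6, 0, 2],    # flanking sequence and no SSR marker
    [4, 2, 0, 0],    # flanking sequence and new SSR marker
]
REGION_TOTALS = [89, 77, 31, 15]  # "Total" row of Table 2

# Table 3 class totals.
SUBSTITUTIONS = 16 + 38  # transitions + transversions
INDELS = 5 + 4           # deletions + insertions


def flanking_mutation_share():
    """Percentage of loci per region with a flanking-sequence mutation."""
    shares = {}
    for idx, region in enumerate(REGIONS):
        mutated = sum(row[idx] for row in FLANKING_ROWS)
        shares[region] = round(100 * mutated / REGION_TOTALS[idx], 1)
    return shares


def substitution_share():
    """Percentage of mutations in perfect repeats that are base substitutions."""
    return round(100 * SUBSTITUTIONS / (SUBSTITUTIONS + INDELS))


print(flanking_mutation_share())
print(substitution_share())
```

The 3'UTR and 5'UTR values reproduce the 60.0% and 25.8% cited above, and the substitution share reproduces the 86% figure from Table 3.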

Overall, the reliability of comparison of microsatellite variation
data among related species is higher than that of cross-species
amplification success rates or polymorphisms. Comparison of
microsatellite variation in these two species suggests that use of
cross-species SSR primers to investigate functionally important
allelic polymorphisms related to the traits of interest may be
inefficient. Therefore, the direct development of species-specific
primers is necessary for association mapping, because within a
species the flanking sequences are conserved and variation at each
SSR locus is confined to repeat number among individuals.